Apparatus, system, and method of providing mobile electronic retail purchases

ABSTRACT

An apparatus, system and method for an object-recognizing retail purchase system. The apparatus, system and method may include an automatically adjustable camera rig comprising a plurality of movable cameras, wherein the plurality of movable cameras are automatically moved by a camera control platform according to characteristics of an object within a view field of the plurality of movable cameras; a first input for receiving images from the automatically adjustable camera rig; and a second input for receiving a plurality of scraped network images regarding a plurality of purchasable objects. Additionally included may be a first computing memory for storing an object profile for each of the plurality of purchasable objects, wherein each of the object profiles comprises at least data from the first input regarding the object within the field of view and data from the second input; and a purchasing platform at least partially present on a mobile device.

CROSS-REFERENCE TO RELATED APPLICATION

The present invention claims priority to U.S. Provisional Application No. 62/677,895 filed May 30, 2018, the entirety of which is incorporated by reference herein.

BACKGROUND

Field of the Disclosure

The disclosure relates generally to electronic commerce, and, more particularly, to an apparatus, system, and method of providing mobile electronic retail purchases.

Background of the Disclosure

It is highly desirable in the modern economy that object recognition, such as using a mobile device, be available for retail mobile applications. That is, there is a need to enable a mobile device user to capture a product image, have that image be recognized by a retail mobile application, and have the correct product reflected in the picture taken by the user be offered to the user for purchase.

More particularly, the global economy is, at present, moving ever increasingly away from so-called “brick and mortar” purchases to online purchases. However, online purchases are limited at present to only those products that a prospective purchaser can find online. That is, a user must know that a retailer has a particular product available, such as via a Google search, a visit to a retailer website, or a search through a retailer application, to discern whether a retailer has a particular product available. Of course, the need for this level of affirmative user interaction may cost a retailer a significant number of sales, at least in that users will typically purchase an item, particularly a relatively inexpensive fungible item, from the first place at which the user finds the item available online.

To accurately create image recognition models, such as to provide information on a product to a prospective customer on a mobile device, many images of the product are needed from different angles. The current industry process for image collection to create machine learning datasets is simple image scraping. Scraping is a process of extracting large amounts of information from a website or websites. The downloaded content may include text, images, full HTMLs, or combinations thereof.

Unfortunately, the approach is limited, because the user does not have control over which images are available, or the quality of those images. That is, no control is available over a number of factors, including: image quality, lighting conditions, angles, and the environments where images were taken. Further, products that are newly released may not have enough (or any) images available online to allow for development of a product recognition model. All of the foregoing results in subpar models with lower than expected prediction results.

Therefore, the need exists for an apparatus, system and method to enable a prospective purchaser to understand, with minimal effort by the user, the availability of a product from a particular retailer, such as wherein the availability information includes the pricing and product data from that retailer.

SUMMARY OF THE DISCLOSURE

The embodiments are and include at least an apparatus, system and method for an object-recognizing retail purchase system. The apparatus, system and method may include an automatically adjustable camera rig comprising a plurality of movable cameras, wherein the plurality of movable cameras are automatically moved by a camera control platform according to characteristics of an object within a view field of the plurality of movable cameras; a first input for receiving images from the automatically adjustable camera rig; and a second input for receiving a plurality of scraped network images regarding a plurality of purchasable objects. Additionally included may be a first computing memory for storing an object profile for each of the plurality of purchasable objects, wherein each of the object profiles comprises at least data from the first input regarding the object within the field of view and data from the second input; and a purchasing platform at least partially present on a mobile device and comprising at least one computer processor having resident thereon non-transitory computing code.

The purchasing platform causes to be performed the steps of: receiving an image of a viewed object within a view field of a mobile device camera of the mobile device; gray-scaling the image of the viewed object; and comparing the gray-scaled image to ones of the object profiles until a matched product is obtained. A purchase link suitable to enable a purchase of the matched product from at least one third party is then provided.

Thus, the embodiments provide an apparatus, system and method to enable a prospective purchaser to understand, with minimal effort by the user, the availability of a product from a particular retailer, such as wherein the availability information includes the pricing and product data from that retailer.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosure is illustrated by way of example and not limitation in the accompanying drawings, in which like references may indicate similar elements, and in which:

FIG. 1 is an illustration of an aspect of the embodiments;

FIGS. 2A, 2B and 2C are illustrations of aspects of the embodiments;

FIG. 3 is an illustration of an aspect of the embodiments;

FIG. 4 is an illustration of an aspect of the embodiments;

FIG. 5 is an illustration of an aspect of the embodiments;

FIG. 6 is an illustration of a processing system;

FIG. 7 illustrates aspects of the embodiments; and

FIGS. 8A and 8B illustrate aspects of the embodiments.

DETAILED DESCRIPTION

The figures and descriptions provided herein may have been simplified to illustrate aspects that are relevant for a clear understanding of the herein described devices, systems, and methods, while eliminating, for the purpose of clarity, other aspects that may be found in typical similar devices, systems, and methods. Those of ordinary skill may recognize that other elements and/or operations may be desirable and/or necessary to implement the devices, systems, and methods described herein. But because such elements and operations are well known in the art, and because they do not facilitate a better understanding of the present disclosure, a discussion of such elements and operations may not be provided herein. However, the present disclosure is deemed to inherently include all such elements, variations, and modifications to the described aspects that would be known to those of ordinary skill in the art.

The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. For example, as used herein, the singular forms “a”, “an” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms “comprises,” “comprising,” “including,” and “having,” are inclusive and therefore specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order discussed or illustrated, unless specifically identified as an order of performance. It is also to be understood that additional or alternative steps may be employed.

When an element or layer is referred to as being “on”, “engaged to”, “connected to” or “coupled to” another element or layer, it may be directly on, engaged, connected or coupled to the other element or layer, or intervening elements or layers may be present. In contrast, when an element is referred to as being “directly on,” “directly engaged to”, “directly connected to” or “directly coupled to” another element or layer, there may be no intervening elements or layers present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., “between” versus “directly between,” “adjacent” versus “directly adjacent,” etc.). As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

Although the terms first, second, third, etc., may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms may be only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. That is, terms such as “first,” “second,” and other numerical terms, when used herein, do not imply a sequence or order unless clearly indicated by the context. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section without departing from the teachings of the exemplary embodiments.

Processor-implemented modules, systems and methods of use are disclosed herein that may provide access to and transformation of a plurality of types of digital content, including but not limited to video, image, text, audio, metadata, algorithms, interactive and document content, and which track, deliver, manipulate, transform, transceive and report the accessed content. Described embodiments of these modules, systems and methods are intended to be exemplary and not limiting. As such, it is contemplated that the herein described systems and methods may be adapted and may be extended to provide enhancements and/or additions to the exemplary modules, systems and methods described. The disclosure is thus intended to include all such extensions.

As mentioned above, it is highly desirable that object recognition, such as using a mobile device 102, be available for retail mobile applications 104. The mobile device should thus enable a user to capture a product image 106, have that image be recognized by a retail mobile application 104, and have the correct product, as reflected in the picture taken by the user, offered to the user for purchase 108, such as from one retailer providing an “app” or from multiple retailers to allow for comparison shopping.

Thereby, the embodiments enable collecting and processing product images to be used for image classification and object recognition, such as in retail mobile applications. More specifically, the disclosed solution provides control as to how the images are created, resulting in better prediction results. Parameters may be adjusted to yield exceptional results, particularly for new products.

Further, for scraped or new images, automated validation scripts may be used to compare prediction results to previously created models, thereby enabling a learning model that provides continuously improving predictions unknown with prior image scraping techniques. Accordingly, the embodiments yield datasets in a way that is scalable and which yields optimal prediction results.
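By way of non-limiting illustration, a validation script of the kind referenced above might be sketched as follows. The sketch assumes that a model can be treated as a callable returning a product label and that a held-out set of labeled images is available; the names `accuracy`, `should_promote`, and `min_gain` are hypothetical and do not appear in the disclosure.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a "model" is any callable mapping an image to a product label.
Model = Callable[[object], str]
Sample = Tuple[object, str]  # (image, expected product label)


def accuracy(model: Model, samples: Sequence[Sample]) -> float:
    """Fraction of held-out samples that the model labels correctly."""
    if not samples:
        return 0.0
    correct = sum(1 for image, label in samples if model(image) == label)
    return correct / len(samples)


def should_promote(candidate: Model, previous: Model,
                   samples: Sequence[Sample], min_gain: float = 0.0) -> bool:
    """Keep the candidate only if it scores at least as well as the previous model."""
    return accuracy(candidate, samples) >= accuracy(previous, samples) + min_gain
```

Under these assumptions, a newly trained model would replace the prior model only when `should_promote` returns true, which is one simple way to obtain predictions that do not regress between iterations.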

Simply put, to accurately create image recognition models, many images of the product are needed from different angles. To the extent sufficient images are unavailable online to enable the disclosed modeling, the disclosed apparatus may solve this problem by taking numerous pictures per product while rotating the product 360° on a rotating disk with cameras pointed at it from varying vertical viewpoints.
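By way of non-limiting illustration, the capture pattern described above may be sketched as a simple schedule that pairs each turntable position with each camera elevation. The function name `capture_plan`, the number of turntable steps, and the elevation values are assumptions chosen only for the example.

```python
from itertools import product
from typing import List, Tuple


def capture_plan(turntable_steps: int,
                 camera_elevations_deg: List[float]) -> List[Tuple[float, float]]:
    """Return (turntable angle, camera elevation) pairs covering one full rotation."""
    if turntable_steps <= 0:
        raise ValueError("turntable_steps must be positive")
    step = 360.0 / turntable_steps
    angles = [i * step for i in range(turntable_steps)]
    # Every camera elevation is captured at every turntable position.
    return list(product(angles, camera_elevations_deg))
```

For example, 36 turntable steps with three cameras would yield 108 images per product under this sketch.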

As a threshold issue, at least two parallel data stores 120, 122 may be present to enable the aforementioned embodiments. The first data store 120 comprises a collection of product images matched to digital data related to the product pictured, as compared and assessed by a comparator 130, such as a software or firmware comparator. That is, the first data store comprises product recognition and consequent data associated with the recognized product. The second data store 122 comprises purchase information/sale information related to each product recognizable via the first data store.

As will be appreciated by the skilled artisan, the first data store may be unique to each or a subset of product sellers, or may be available via a so-called “white label” to one or many product sellers, such as in a cloud-based availability in accordance with payment of a subscription fee, by way of non-limiting example. The second data store may, in preferred embodiments, be unique to each seller of goods. That is, the second data store may be unique in each offered retail application, such as wherein each seller sets pricing, maximums/minimums, and/or availability information for the products in the first data store that are available from that seller. The first and second data stores 120, 122 are illustrated in the example of FIG. 1.
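By way of non-limiting illustration, the relationship between the two data stores may be sketched with two record types keyed by a shared product identifier. The class names, field names, and the `offer_for` helper are hypothetical; the disclosure does not prescribe a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProductProfile:
    """Entry in the first data store: recognition data for one product."""
    product_id: str
    name: str
    image_refs: List[str] = field(default_factory=list)


@dataclass
class SellerOffer:
    """Entry in the second data store: one seller's terms for one product."""
    product_id: str
    price: float
    available: bool
    purchase_url: str


def offer_for(product_id: str,
              first_store: Dict[str, ProductProfile],
              second_store: Dict[str, SellerOffer]) -> Optional[SellerOffer]:
    """Return the seller's offer only if the product is known and currently for sale."""
    if product_id not in first_store:
        return None
    offer = second_store.get(product_id)
    return offer if offer is not None and offer.available else None
```

Keeping the stores separate in this way allows a shared, white-label first store to be paired with a different second store per seller.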

Further, the image recognition model or models that provide the information that allows for matching of a pictured product to the first data store necessarily require one or more images of any product that is to be included in the first data store. By way of example, multiple images of a product may be necessary from different angles, such as to account for the fact that a user may capture an image of the object at one or more “off” angles from a “head-on” image. Thus, an image recognition model may be generated using at least two methods: in the first method, many images may be taken of each product desired for inclusion in the first data store; in a second method, existing pictures of a product may be “scraped” from one or multiple locations, such as locations that are publicly or privately available, wherein the scraped images may be combined into a unique image recognition model for each product to be included in the first data store.

In the first method, many images, such as tens, hundreds, or thousands of images per product at various angles, such as up to 360° vertically and 360° horizontally, may be captured. Several challenges exist in this regard. A robust camera stand must take pictures of products of varying size from different angles. The camera(s) and stand must be automated, and must be controllable locally or remotely, such as via a central web application. Thus, the local interface may include local processing to execute local or remote instructions on the local hardware.

Automation may include focusing cameras properly and in the correct direction for different size objects without needing to continually manually adjust the stand and cameras. The application should include quality control comparisons to ensure that the images are of the correct quality and enough variety is obtained for the creation of a robust model.

In order to provide for the foregoing, FIGS. 2A, 2B and 2C illustrate an image rig that allows for pictures of products of varying size to be taken from multiple angles by a camera. Of note, the camera and rig illustrated may be locally controlled, either automatically or manually, or may be controlled remotely, such as via the aforementioned web or mobile application provided uniquely to each retailer, or administratively controlled by a provider of a subscription service which enables access to images on a per product basis.

Of note, the camera(s) 202 illustrated may be manually or automatically focused/moved 204 in the correct direction or directions to account for different size objects 206 nearer to and farther from the camera(s), and this focus may be controlled locally or remotely as discussed throughout. Moreover, the illustrated image collection model may include a quality control aspect or application, as referenced throughout, to ensure that the images collected are of an acceptable quality to provide for machine learning that creates the image recognition models discussed throughout.

Alternatively, a plurality of images may be scraped from available sources, such as Google Images, Amazon, available search engines, and the like, to provide a training data set to enable machine learning. By way of example, a convolutional neural model 302 may execute upon a Google Images query 304. This is illustrated in the example of FIG. 3. In such an instance, images and/or image URLs 306 may be captured, and relevant images marked and captured/returned, either manually or using a code-based object recognition, such as to mark all images related to a particular searched product as results 310. It goes without saying that quality control, as referenced above, may be applied to images automatically, such as to ensure image files are not corrupted and are suitable for opening by a machine learning model.
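By way of non-limiting illustration, an automated first-pass check that scraped files are at least plausibly valid images may be sketched as follows. The sketch assumes only PNG and JPEG inputs and inspects the standard file signatures; the function names are hypothetical, and a fuller quality control step would also attempt to decode each image.

```python
from typing import Dict, List

# Standard leading bytes of PNG and JPEG files.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_like_image(data: bytes) -> bool:
    """Cheap check: a known signature followed by at least some content."""
    if data.startswith(PNG_SIGNATURE):
        return len(data) > len(PNG_SIGNATURE)
    if data.startswith(JPEG_SIGNATURE):
        return len(data) > len(JPEG_SIGNATURE)
    return False


def filter_usable(files: Dict[str, bytes]) -> List[str]:
    """Return the names of files that pass the signature check, sorted by name."""
    return sorted(name for name, data in files.items() if looks_like_image(data))
```

A check of this kind rejects, for example, an HTML error page that was saved under an image file name during scraping.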

In short, images may be uploaded, such as from scraping or from the automated capture system, and grouped into cloud storage containers with the appropriate product references responsive to the searches/groupings assessed. As such, the images may be categorically and hierarchically archived 320, such as in a cloud storage facility, for access once needed in the future. This reduces the cost of long-term storage for the plethora of images.
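By way of non-limiting illustration, a categorical and hierarchical archive may be realized by deriving each image's storage key from its category path, product reference, and source. The key layout and the names `storage_key` and `_slug` are assumptions made for the example and are not tied to any particular cloud provider.

```python
import re
from typing import Sequence


def _slug(text: str) -> str:
    """Lower-case the text and replace runs of other characters with one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def storage_key(category_path: Sequence[str], product_id: str,
                source: str, filename: str) -> str:
    """Build a key such as 'category/subcategory/product/source/file'."""
    parts = [_slug(p) for p in category_path]
    parts += [_slug(product_id), _slug(source), filename]
    return "/".join(parts)
```

Because the key begins with the category hierarchy, all images for a category or product may later be listed by prefix.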

Thereafter, machine learning may be applied to the relevant data set in order to “learn” suitable images for comparison to a given individual product. For example, a cloud batch process may be triggered to create the models from the captured/stored images, such as in formats ready for use on iOS and Android devices. Of course, it will be understood that the machine learning discussed herein may be used in conjunction with one or more image-building applications, such that unavailable images, such as at particular angles or in particular lighting, may be extrapolated by the machine learning model based on “approved” images of a particular product.

In either of the foregoing cases, once a relevant image collection is created for a given product, that product and its corresponding images may be hierarchically categorized, such as into a limited number of available categories into which products may be placed, either manually or automatically. Thereby, the disclosed object recognition model that employs the image collection may engage in refined processing, at least in that the machine learning model enables the recognition of at least one or more broad categories into which an imaged product may be properly placed, such that further drill downs to the images of the product for comparison by the object recognition model may be limited based on a predefined rule set in the object recognition model. By way of example, it may be evident from a captured image that the imaged product is a kitchen appliance. Thereafter, the machine learning model may limit the available drill downs for the object recognition model only to those images that clearly illustrate kitchen appliances. That is, the object recognition model may be enabled to drill down to ultimately conclude that the image is of a particular type of toaster, but may not be enabled by the initial machine learning to drill down into a hierarchy of products that comprise mufflers for automobiles.
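By way of non-limiting illustration, the category-limited drill down described above may be sketched as a two-stage lookup. The categorizer and the scoring function are passed in as callables because the disclosure does not fix their implementation; the names `recognize`, `categorize`, and `score` are hypothetical.

```python
from typing import Callable, Dict, List


def recognize(image: object,
              categorize: Callable[[object], str],
              candidates_by_category: Dict[str, List[str]],
              score: Callable[[object, str], float]) -> str:
    """Pick a broad category first, then score only the products within it."""
    category = categorize(image)
    candidates = candidates_by_category.get(category, [])
    if not candidates:
        raise LookupError(f"no products registered under category {category!r}")
    # Products in other categories are never scored.
    return max(candidates, key=lambda product: score(image, product))
```

In the example of the text, an image categorized as a kitchen appliance would be scored against toasters and the like, and never against automobile mufflers.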

Needless to say, both the object categorization and the object recognition algorithms discussed herein throughout may necessitate training of the machine learning. By way of particular example, after generating enough images for each product to create a viable product-specific object recognition model, a training algorithm may be run to create, for example, both a CoreML (iPhone) and Python compatible model.

In the foregoing instance, the Python compatible model may be used primarily for testing against previous iterations to determine how to proceed with generating new models. Further, as the last step of the API, the CoreML model may be uploaded to an AWS bucket so that users of the iPhone app have the latest products available for scanning.

The embodiments provide an accurate visual recognition machine learning model that provides optimal accuracy while keeping the model size and computer processing times to a minimum, as discussed throughout. It is preferable, to allow for execution of retail applications as discussed throughout, that the employed models are compatible with native visual recognition libraries on iOS and Android devices.

More particularly, an object recognition algorithm used in the disclosure may gray scale the received image, thereby decreasing the size of the image. The algorithm then creates numerous boxes which scan the image for Haar-like features and patterns detected in the received pixels by assessing darker versus lighter color. After many images of the same object have been observed as referenced above, common Haar-like features may thereafter be recognized as attributable to that object. Accordingly, an inputted photo may be scanned and its Haar-like features used to signal a match with the closest matching features of all the objects in the model.
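By way of non-limiting illustration, gray-scaling and a single two-rectangle Haar-like feature may be sketched in plain Python as follows. The luminance weights are the common ITU-R BT.601 values, which is an assumption, as the disclosure does not specify a gray-scale formula; the use of an integral image is a standard technique for evaluating box sums and is likewise not stated in the text. All function names are hypothetical.

```python
from typing import List, Tuple

Gray = List[List[float]]


def to_gray(rgb: List[List[Tuple[int, int, int]]]) -> Gray:
    """Convert RGB pixels to one luminance value each (BT.601 weights, assumed)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]


def integral_image(gray: Gray) -> Gray:
    """ii[y][x] holds the sum of all pixels in rows < y and columns < x."""
    h, w = len(gray), len(gray[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii: Gray, x: int, y: int, w: int, h: int) -> float:
    """Sum of the w-by-h box whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: Gray, x: int, y: int, w: int, h: int) -> float:
    """Left half minus right half; a large magnitude marks a light/dark vertical edge."""
    half = w // 2
    return box_sum(ii, x, y, half, h) - box_sum(ii, x + half, y, half, h)
```

Sliding `two_rect_feature` across the image at several box sizes would yield the lighter-versus-darker responses from which recurring features of a given object may be learned.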

Of course, rather than matching across all models, size, shape, and the like may first be used by the model to assess a category of item initially, whereafter the item in the picture may be matched by category to thus limit processing needed. For example, the iOS ARKit allows for calculation of the dimensions of a product. That is, utilizing the mobile device camera and native capabilities, the statistics of a detected object may be returned. A plurality of position vectors may thereby be extracted and compared with other vectors, such as those of the stored product(s).

Although more processing intensive dependent upon the level of detail, such an analysis may account for even small percentage issues as follows. The average length and width may be calculated by finding every length and width (whole width and height) and dividing by the number of lengths and widths. Trigonometric features, such as the tangent or arc-tangent, may also be used to account for the displaced depth value to provide a more accurate reading of object size. Once depth displacement is accounted for, the Pythagorean theorem may be used to determine a final width and height of the object, for example.
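By way of non-limiting illustration, one possible reading of the foregoing size analysis is sketched below. The sketch assumes that each measured edge is given by two three-dimensional position vectors whose third component is depth from the camera; the arc-tangent gives the tilt of the edge out of the image plane, and the Pythagorean theorem combines the in-plane extent with the depth displacement. The function names and the point representation are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z); z is depth from the camera (assumed)


def tilt_angle(a: Point, b: Point) -> float:
    """Angle in radians by which the segment a-b leans out of the image plane."""
    in_plane = math.hypot(b[0] - a[0], b[1] - a[1])
    return math.atan2(b[2] - a[2], in_plane)


def true_length(a: Point, b: Point) -> float:
    """Pythagorean length of a-b: in-plane extent combined with depth displacement."""
    in_plane = math.hypot(b[0] - a[0], b[1] - a[1])
    return math.hypot(in_plane, b[2] - a[2])


def average_length(segments: Sequence[Tuple[Point, Point]]) -> float:
    """Mean corrected length over every measured segment."""
    return mean(true_length(a, b) for a, b in segments)
```

Averaging over several measured edges in this way damps small per-measurement errors before the dimensions are compared with stored values.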

Further and by way of example, the height component may additionally be unitized to form a unit vector in the y direction, and the magnitude of the width component may get divided by the previous height vector magnitude to form a ratio between height and width, so that the same object at a different distance will produce the same result. The new width ratio value may then be compared to the stored width ratio values (calculated actual values of objects) and the closest match may be returned.
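By way of non-limiting illustration, the distance-independent width ratio and the closest-match lookup may be sketched as follows. The function names and the stored ratio values are hypothetical and are used only for the example.

```python
from typing import Dict


def width_ratio(width: float, height: float) -> float:
    """Width expressed in units of height, so apparent scale cancels out."""
    if height <= 0:
        raise ValueError("height must be positive")
    return width / height


def closest_by_ratio(width: float, height: float, stored: Dict[str, float]) -> str:
    """Return the stored product whose ratio is nearest to the measured one."""
    measured = width_ratio(width, height)
    return min(stored, key=lambda name: abs(stored[name] - measured))
```

Because only the ratio is compared, an object measured as 30 by 20 and the same object measured as 3 by 2 from farther away resolve to the same stored product.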

The foregoing analysis is provided by way of example, and as such other techniques, such as bounding box detection, may be implemented without departing from the spirit of the disclosure. Additionally and needless to say, the foregoing refined analysis may also be used to delineate between highly similar products.

Moreover, the created models may run locally on native retail applications, i.e., completely offline and with no server costs. Alternatively, the models may run partially or fully as thin client applications with only the user-interface natively provided.

Once the object recognition model is enabled by the machine learning and image collection discussed herein, the object recognition model may operate to recognize the object in an image while engaging in optimally minimal levels of processing. For example, an object recognition model may include a rule set that initially seeks only key features, such as edges, contours, colors, the presence of controls, the presence of electronic readouts, or the like in order to recognize those features and initially categorize the product in the image. These common features may be recognized and attributed to the object by the object recognition model such that an initial category and available drill downs are attributed to the product in the image. Upon drill down, additional matching features may be discerned by the object recognition model such that the closest matching features of the objects in the model are paired with the imaged object.

Of course, the skilled artisan will appreciate in light of the discussion herein that the image recognition model may encounter difficulty in differentiating between very similar products. In the event of such highly similar products, the object recognition model may, such as only in such circumstances, be enabled to engage in much more significant processing. For example, the object image may be assigned a large number of positional vectors, including length, width, height, depth, density, angles, and any other aspects that may be assessed in an image, including trigonometric features related to the imaged object, by way of non-limiting example. All and/or each of these refined vectors may be compared to the corresponding refined vectors in the image store of the object recognition model. It will be appreciated that, in preferred embodiments, this level of refined vector comparison should be avoided, to the extent possible, in order to minimize and expedite processing of a captured image. FIG. 4 illustrates a processing system executing code to, in part, perform a refined vector comparison as discussed above.
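By way of non-limiting illustration, a refined vector comparison among a short list of highly similar products may be sketched as a nearest-neighbor search. Euclidean distance is an assumption, as the disclosure does not name a distance measure, and the names `distance` and `refine` are hypothetical.

```python
import math
from typing import Dict, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def refine(measured: Sequence[float], shortlist: Dict[str, Sequence[float]]) -> str:
    """Return the shortlisted product whose stored vector is nearest to the measurement."""
    return min(shortlist, key=lambda name: distance(measured, shortlist[name]))
```

Consistent with the preference stated above, such a comparison would be invoked only for the few candidates that survive the cheaper category and key-feature matching.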

It will be appreciated that limited, some, or all aspects of the above comparisons may be locally or remotely associated with a particular retail application. By way of example, only the broad categorization aspect of the object recognition model disclosed may be locally provided in a retail application on a mobile device. Thereafter, once provided with the category, a user may be enabled to perform a manual drill down to assess the imaged product, so that extensive processing and/or local storage of the refined vector analysis discussed above may be avoided. Similarly, only certain categories of products for the first data store may be available in local memory. Thus, for example, automobile tire information for the first data store may be stored locally, at least because of the comparatively limited number of tire manufacturers and types of tires that such data storage must account for. However, other product types for the first data store, such as chairs, which have a large number of manufacturers and an even larger number of types and sizes, may be available only remotely.

As illustrated in FIG. 5, once an object in an image is recognized 502 by the object recognition model 504, the information from the first data store 506 is associated with that object 506, and, to the extent the second data store indicates that product is for sale from the relevant seller 510, that product may be provided to the user in an application, such as on a mobile device, as available for purchase by the user 512 (see also, e.g., FIG. 7). It will be understood in light of the discussion herein that the application referenced may have available to it one or more APIs that allow for variations and variable uses of the aspects discussed herein throughout. For example, the API may allow for an application available on multiple platforms, such as Android and iOS; a selection of categories of products to be made available for purchase through the application; the level of user interaction available to the user, such as whether the user is enabled to drill down manually through the hierarchical menu discussed herein, is to receive a presentation of a limited number of products to select from as the one imaged by the user, or is to receive only the “best guess” product provided by the disclosed embodiments; and so on. It will also be understood that the API may allow for selection by the application provider of whether the user must affirmatively image from within the application in order to be provided with a purchasable product, or can simply point the camera at an object and have the application automatically capture the image and discern the product to be provided for purchase.
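By way of non-limiting illustration, the provider-selectable options described above may be sketched as a configuration object that governs how recognition results are presented. The class names, field names, and default values are hypothetical and do not correspond to any actual API of the disclosed embodiments.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List


class Interaction(Enum):
    MANUAL_DRILL_DOWN = "manual_drill_down"  # user browses the hierarchical menu
    SHORTLIST = "shortlist"                  # user picks from a few candidates
    BEST_GUESS = "best_guess"                # only the top candidate is shown


@dataclass(frozen=True)
class AppConfig:
    platforms: FrozenSet[str]
    categories: FrozenSet[str]
    interaction: Interaction
    auto_capture: bool = False  # True: no affirmative image capture is required
    shortlist_size: int = 3


def present(ranked: List[str], config: AppConfig) -> List[str]:
    """Trim a ranked candidate list according to the configured interaction level."""
    if config.interaction is Interaction.BEST_GUESS:
        return ranked[:1]
    if config.interaction is Interaction.SHORTLIST:
        return ranked[:config.shortlist_size]
    return ranked  # manual drill down: hand the full list to the menu
```

Under this sketch, two retailers using the same recognition models could present different experiences solely by supplying different configurations.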

More specifically, the API may enable a native iOS application. The native iOS application may take input from a live video feed, and may run each frame through the disclosed object modeling, resulting in a prediction of what object is recognized in the image. This process may be performed locally on the device, and thus may be available for offline use. Results may be rendered in the UI in an augmented reality (AR) call-out label, for example, which may appear above each product in a 3D presentation space. A link may be embedded in the call-out, which, when activated, may direct a user to a website where she can purchase the product. The foregoing is illustrated by way of example in FIGS. 8A and 8B.

FIG. 6 depicts an exemplary computer processing system 1312 for use in association with the embodiments, by way of non-limiting example. Processing system 1312 is capable of executing software, such as an operating system (OS), applications, user interface, and/or one or more other computing algorithms/applications 1490, such as the recipes, models, programs and subprograms discussed herein. The operation of exemplary processing system 1312 is controlled primarily by these computer readable instructions/code 1490, such as instructions stored in a computer readable storage medium, such as hard disk drive (HDD) 1415, optical disk (not shown) such as a CD or DVD, solid state drive (not shown) such as a USB “thumb drive,” or the like. Such instructions may be executed within central processing unit (CPU) 1410 to cause system 1312 to perform the disclosed operations, comparisons and calculations. In many known computer servers, workstations, personal computers, and the like, CPU 1410 is implemented in an integrated circuit called a processor.

It is appreciated that, although exemplary processing system 1312 is shown to comprise a single CPU 1410, such description is merely illustrative, as processing system 1312 may comprise a plurality of CPUs 1410. Additionally, system 1312 may exploit the resources of remote CPUs (not shown) through communications network 1470 or some other data communications means 1480, as discussed throughout.

In operation, CPU 1410 fetches, decodes, and executes instructions from a computer readable storage medium, such as HDD 1415. Such instructions may be included in software 1490. Information, such as computer instructions and other computer readable data, is transferred between components of system 1312 via the system's main data-transfer path. The main data-transfer path may use a system bus architecture 1405, although other computer architectures (not shown) can be used.

Memory devices coupled to system bus 1405 may include random access memory (RAM) 1425 and/or read only memory (ROM) 1430, by way of example. Such memories include circuitry that allows information to be stored and retrieved. ROMs 1430 generally contain stored data that cannot be modified. Data stored in RAM 1425 can be read or changed by CPU 1410 or other hardware devices. Access to RAM 1425 and/or ROM 1430 may be controlled by memory controller 1420.

In addition, processing system 1312 may contain peripheral communications controller and bus 1435, which is responsible for communicating instructions from CPU 1410 to, and/or receiving data from, peripherals, such as peripherals 1440, 1445, and 1450, which may include printers, keyboards, and/or the operator interaction elements on a mobile device as discussed herein throughout. An example of a peripheral bus is the Peripheral Component Interconnect (PCI) bus that is well known in the pertinent art.

Operator display 1460, which is controlled by display controller 1455, may be used to display visual output and/or presentation data generated by or at the request of processing system 1312, such as responsive to operation of the aforementioned computing programs/applications 1490. Such visual output may include text, graphics, animated graphics, and/or video, for example. Display 1460 may be implemented with a CRT-based video display, an LCD or LED-based display, a gas plasma-based flat-panel display, a touch-panel display, or the like. Display controller 1455 includes electronic components required to generate a video signal that is sent to display 1460.

Further, processing system 1312 may contain network adapter 1465 which may be used to couple to external communication network 1470, which may include or provide access to the Internet, an intranet, an extranet, or the like. Communications network 1470 may provide processing system 1312 with means of communicating and transferring software and information electronically. Additionally, communications network 1470 may provide for distributed processing, which involves several computers and the sharing of workloads or cooperative efforts in performing a task, as discussed above. Network adapter 1465 may communicate to and from network 1470 using any available wired or wireless technologies. Such technologies may include, by way of non-limiting example, cellular, Wi-Fi, Bluetooth, infrared, or the like.

More particularly, and with reference to FIGS. 5 and 6, the embodiments may provide a cloud environment, such as with an accompanying web dashboard, to control the camera(s), review scan results, modify object recognition, and provide any “back end” for a retail or white label “app.” The dashboard may include privacy, password protection, and administrator controls.

The interface may enable input, categorization, and control of product data. The interface may interact with the cloud infrastructure to create, rename and delete image storage containers. The interface may oversee the automated batch processes for the creation of the machine learning model.

As referenced throughout, the computing system of FIG. 6 may enable the provision of the disclosed API. The API may include an image processing pipeline which, after completing the recording of a video, such as via the GigE industrial cameras, receives data in, for example, a C++-handled component wrapped around the GigE industrial cameras SDK. The data streamed through this component, with help from GStreamer, may then be sent to a Python server for processing.

With the videos downloaded and ready to be processed, the image processing pipeline, capable of video splitting, segmentation, and masking, may operate as follows. For video splitting and to reduce the volume of images, frames may be pulled from a video on a step basis. The translation of disk rotation speed and degree separation may yield the step value.
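
By way of non-limiting illustration only, the translation of disk rotation speed and degree separation into a step value may be sketched as follows. The sketch assumes that the disk rotates at a constant speed and that the video has a fixed frame rate, neither of which is stated in the disclosure; the function names frame_step and select_frame_indices and all parameter names are hypothetical.

```python
def frame_step(fps, rotation_rpm, degrees_between_frames):
    """Return how many video frames separate two kept frames.

    Hypothetical helper: assumes constant disk rotation speed (in
    revolutions per minute) and a constant video frame rate.
    """
    if fps <= 0 or rotation_rpm <= 0 or degrees_between_frames <= 0:
        raise ValueError("all inputs must be positive")
    degrees_per_second = rotation_rpm * 360.0 / 60.0
    degrees_per_frame = degrees_per_second / fps
    # Never return a step below 1, so that no frame index is repeated.
    return max(1, round(degrees_between_frames / degrees_per_frame))


def select_frame_indices(total_frames, step):
    """Return the indices of the frames to pull from the video."""
    return list(range(0, total_frames, step))


# Example: a 30 frames-per-second video of a disk turning at 2 rpm,
# keeping one frame for roughly every 10 degrees of rotation.
step = frame_step(fps=30, rotation_rpm=2, degrees_between_frames=10)
indices = select_frame_indices(total_frames=900, step=step)
```

Under these assumed values, one full rotation spans 900 frames and the step reduces it to 36 retained frames.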

For image segmentation, the processing may remove the background from the focal object. Segmenting the background from the object enables processing of the products regardless of their color. This process thus differs from masking by pixel color (e.g., using a green screen), and instead uses machine learning models to predict the focal object. Using the disclosed method rather than a green screen provides a better predictive result and automates the enrollment process.

This pipeline step provides the ability to find the bounding box of the product being scanned. For image classification, this allows cropping of the area surrounding the object, which better enables training of the classification models. For object detection, this step enables training of machine learning models to look for the object in an image, rather than simply identifying whether an image is of a product.
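
By way of non-limiting illustration only, finding a bounding box and cropping to it may be sketched as follows. The sketch assumes that segmentation has already produced a binary mask, represented here as a list of rows in which nonzero entries mark the focal object; that representation and the function names bounding_box and crop are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the nonzero
    mask pixels, or None when the mask contains no object pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    width = len(mask[0])
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(image, box):
    """Return the part of the image that lies inside the inclusive box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Example: a 4 x 5 mask whose object pixels span rows 1-2 and columns 1-3.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
image = [[10 * r + c for c in range(5)] for r in range(4)]
box = bounding_box(mask)
cropped = crop(image, box)
```

The same box may serve both uses described above: as a crop region for classification training images, or as a label for object detection training.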

Masking may be performed to place the segmented image onto selectively chosen backgrounds. Thereby, depending on the model then being built, the processing can rotate, brighten, randomize the location of, resize, and blur the segmented image to produce a pool of images from which algorithms can be efficiently trained.
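
By way of non-limiting illustration only, producing a pool of training images from one segmented image may be sketched as follows. The sketch covers only randomized location and brightening; rotation, resizing, and blurring are omitted for brevity. It assumes grayscale images stored as lists of rows with values from 0 to 255, and the function name composite, the brightness range, and the fixed random seed are hypothetical.

```python
import random


def composite(background, obj, mask, rng, brightness_range=(0.8, 1.2)):
    """Paste the masked object pixels onto a copy of the background at a
    random location, scaling object brightness by a random gain."""
    bg_h, bg_w = len(background), len(background[0])
    ob_h, ob_w = len(obj), len(obj[0])
    if ob_h > bg_h or ob_w > bg_w:
        raise ValueError("object must fit inside the background")
    top = rng.randint(0, bg_h - ob_h)
    left = rng.randint(0, bg_w - ob_w)
    gain = rng.uniform(*brightness_range)
    out = [row[:] for row in background]  # leave the input background intact
    for r in range(ob_h):
        for c in range(ob_w):
            if mask[r][c]:
                value = round(obj[r][c] * gain)
                out[top + r][left + c] = min(255, max(0, value))
    return out, (top, left)


# Example: four variants of a 2 x 2 object on a 5 x 6 background.
rng = random.Random(0)  # seeded only to make the example repeatable
background = [[10] * 6 for _ in range(5)]
obj = [[200, 200], [200, 200]]
mask = [[1, 0], [1, 1]]
pool = [composite(background, obj, mask, rng)[0] for _ in range(4)]
```

Only pixels selected by the mask overwrite the background, so the object's original surroundings do not carry over into the generated images.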

In sum, image classification differs from object detection, and so the input data used to create the model(s) also differs. The API may thus comprise the ability to create both, with the distinctions handled in the web dashboard.

In the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of clarity and brevity of the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the embodiments require more features than are expressly recited herein. Rather, the disclosure is to encompass all variations and modifications to the disclosed embodiments that would be understood to the skilled artisan in light of the disclosure.

What is claimed is:
 1. An object-recognizing retail purchase system, comprising: an automatically adjustable camera rig comprising a plurality of movable cameras, wherein the plurality of movable cameras are automatically moved by a camera control platform according to characteristics of an object within a view field of the plurality of movable cameras; a first input for receiving images from the automatically adjustable camera rig; a second input for receiving a plurality of scraped network images regarding a plurality of purchasable objects; a first computing memory for storing an object profile for each of the plurality of purchasable objects, wherein each of the object profiles comprises at least data from the first input regarding the object within the field of view and data from the second input; a purchasing platform at least partially present on a mobile device and comprising at least one computer processor having resident thereon non-transitory computing code which, when executed by the at least one computing processor, causes to be performed the steps of: receiving an image of a viewed object within a view field of a mobile device camera of the mobile device; gray-scaling the image of the viewed object; comparing the gray-scaled image to ones of the object profiles until a matched product is obtained; providing a purchase link suitable to enable a purchase of the matched product from at least one third party.
 2. The object-recognizing retail purchase system of claim 1, wherein the camera control platform is remote from the camera rig.
 3. The object-recognizing retail purchase system of claim 1, wherein the camera rig comprises a semicircular camera base.
 4. The object-recognizing retail purchase system of claim 1, wherein the third party is a retailer.
 5. The object-recognizing retail purchase system of claim 1, wherein the purchase link comprises a plurality of purchase links to a plurality of third parties.
 6. The object-recognizing retail purchase system of claim 1, wherein the data to the first input comprises gray-scaling.
 7. The object-recognizing retail purchase system of claim 1, wherein the network comprises the cloud.
 8. The object-recognizing retail purchase system of claim 1, wherein the mobile device comprises a proprietary operating system.
 9. The object-recognizing retail purchase system of claim 8, wherein the proprietary operating system comprises iOS.
 10. The object-recognizing retail purchase system of claim 8, wherein the proprietary operating system comprises Android.
 11. The object-recognizing retail purchase system of claim 1, wherein the image scraping to the second input is responsive to a manual computing search.
 12. The object-recognizing retail purchase system of claim 1, wherein the purchasing platform is fully present on the mobile device.
 13. The object-recognizing retail purchase system of claim 12, wherein the purchasing platform comprises an app.
 14. The object-recognizing retail purchase system of claim 1, wherein the image of the viewed object of the mobile device comprises a moving image.
 15. The object-recognizing retail purchase system of claim 1, wherein the image of the viewed object of the mobile device comprises a single view.
 16. The object-recognizing retail purchase system of claim 1, wherein the image of the viewed object is directed manually.
 17. The object-recognizing retail purchase system of claim 1, wherein the object profiles are stored and accessed categorically.
 18. The object-recognizing retail purchase system of claim 17, wherein the categories are manually accessible via a hierarchical menu.
 19. The object-recognizing retail purchase system of claim 18, wherein the hierarchical menu comprises a drop down menu.
 20. The object-recognizing retail purchase system of claim 1, wherein the provided purchase link comprises a confidence level of the purchasable product.